[SPARK-14127][SQL] Native "DESC [EXTENDED | FORMATTED] <table>" DDL command #12844
liancheng wants to merge 6 commits into apache:master from liancheng:spark-14127-desc-table
Conversation
Test build #57539 has finished for PR 12844 at commit
Test build #57597 has finished for PR 12844 at commit
Force-pushed from 803f28e to 0bc9f5a
Test build #57601 has finished for PR 12844 at commit
A typo bug, not related to this PR.
Oh wait... it's actually a typo from Hive... 😵 "Fixing" it fails an existing test case.
- Shows partition columns for EXTENDED and FORMATTED
- Shows "Compressed:" field
- Shows data types in lower case
Force-pushed from 0bc9f5a to 9194fe1
Test build #57613 has finished for PR 12844 at commit
Test build #57621 has finished for PR 12844 at commit
Test build #57630 has finished for PR 12844 at commit
    inputFormat: Option[String],
    outputFormat: Option[String],
    serde: Option[String],
    compressed: Boolean,
Is this ever true? If it isn't, we could leave it out.
Nvm. Hive can pass compressed tables.
LGTM
Test build #57729 has finished for PR 12844 at commit
Thanks for the review! I'm merging this to master and branch-2.0.
[SPARK-14127][SQL] Native "DESC [EXTENDED | FORMATTED] <table>" DDL command
## What changes were proposed in this pull request?
This PR implements native `DESC [EXTENDED | FORMATTED] <table>` DDL command. Sample output:
```
scala> spark.sql("desc extended src").show(100, truncate = false)
+----------------------------+---------------------------------+-------+
|col_name |data_type |comment|
+----------------------------+---------------------------------+-------+
|key |int | |
|value |string | |
| | | |
|# Detailed Table Information|CatalogTable(`default`.`src`, ...| |
+----------------------------+---------------------------------+-------+
scala> spark.sql("desc formatted src").show(100, truncate = false)
+----------------------------+----------------------------------------------------------+-------+
|col_name |data_type |comment|
+----------------------------+----------------------------------------------------------+-------+
|key |int | |
|value |string | |
| | | |
|# Detailed Table Information| | |
|Database: |default | |
|Owner: |lian | |
|Create Time: |Mon Jan 04 17:06:00 CST 2016 | |
|Last Access Time: |Thu Jan 01 08:00:00 CST 1970 | |
|Location: |hdfs://localhost:9000/user/hive/warehouse_hive121/src | |
|Table Type: |MANAGED | |
|Table Parameters: | | |
| transient_lastDdlTime |1451898360 | |
| | | |
|# Storage Information | | |
|SerDe Library: |org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | |
|InputFormat: |org.apache.hadoop.mapred.TextInputFormat | |
|OutputFormat: |org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat| |
|Num Buckets: |-1 | |
|Bucket Columns: |[] | |
|Sort Columns: |[] | |
|Storage Desc Parameters: | | |
| serialization.format |1 | |
+----------------------------+----------------------------------------------------------+-------+
```
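The formatted output above is a three-column result set (`col_name`, `data_type`, `comment`) built row by row. As a rough, self-contained sketch of that row-building pattern (not Spark's actual `DescribeTableCommand` — the `DescribeRow` type and the signatures here are simplified stand-ins):

```scala
import scala.collection.mutable.ArrayBuffer

// Simplified model of a single output row of DESC FORMATTED.
case class DescribeRow(colName: String, dataType: String, comment: String)

object DescribeSketch {
  // columns:  (name, type) pairs for the table schema
  // metadata: (label, value) pairs for the "Detailed Table Information" section
  def describeFormatted(
      columns: Seq[(String, String)],
      metadata: Seq[(String, String)]): Seq[DescribeRow] = {
    val buffer = ArrayBuffer.empty[DescribeRow]

    def append(name: String, dataType: String, comment: String): Unit =
      buffer += DescribeRow(name, dataType, comment)

    // Regular columns first, with data types shown in lower case.
    columns.foreach { case (name, tpe) => append(name, tpe.toLowerCase, "") }

    // Blank separator row, then the section header, then one row per
    // metadata entry, mirroring the sample output above.
    append("", "", "")
    append("# Detailed Table Information", "", "")
    metadata.foreach { case (k, v) => append(k + ":", v, "") }

    buffer.toSeq
  }
}
```

For example, `describeFormatted(Seq("key" -> "INT", "value" -> "STRING"), Seq("Database" -> "default"))` yields the two column rows, a blank row, the section header, and a `Database: default` row, in that order.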
## How was this patch tested?
A test case is added to `HiveDDLSuite` to check command output.
Author: Cheng Lian <lian@databricks.com>
Closes #12844 from liancheng/spark-14127-desc-table.
(cherry picked from commit f152fae)
Signed-off-by: Cheng Lian <lian@databricks.com>
…data source tables

## What changes were proposed in this pull request?

This is a follow-up of PR #12844. It makes the newly updated `DescribeTableCommand` support data source tables.

## How was this patch tested?

A test case is added to check `DESC [EXTENDED | FORMATTED] <table>` output.

Author: Cheng Lian <lian@databricks.com>

Closes #12934 from liancheng/spark-14127-desc-table-follow-up.

(cherry picked from commit 671b382)
Signed-off-by: Yin Huai <yhuai@databricks.com>
…able properties for data source tables

## What changes were proposed in this pull request?

This is a follow-up of #12934 and #12844. This PR adds a set of utility methods in `DDLUtils` to help extract schema information (user-defined schema, partition columns, and bucketing information) from data source table properties. These utility methods are then used in `DescribeTableCommand` to refine output for data source tables. Before this PR, the aforementioned schema information is only shown as table properties, which are hard to read. Sample output:

```
+----------------------------+---------------------------------------------------------+-------+
|col_name                    |data_type                                                |comment|
+----------------------------+---------------------------------------------------------+-------+
|a                           |bigint                                                   |       |
|b                           |bigint                                                   |       |
|c                           |bigint                                                   |       |
|d                           |bigint                                                   |       |
|# Partition Information     |                                                         |       |
|# col_name                  |                                                         |       |
|d                           |                                                         |       |
|                            |                                                         |       |
|# Detailed Table Information|                                                         |       |
|Database:                   |default                                                  |       |
|Owner:                      |lian                                                     |       |
|Create Time:                |Tue May 10 03:20:34 PDT 2016                             |       |
|Last Access Time:           |Wed Dec 31 16:00:00 PST 1969                             |       |
|Location:                   |file:/Users/lian/local/src/spark/workspace-a/target/...  |       |
|Table Type:                 |MANAGED                                                  |       |
|Table Parameters:           |                                                         |       |
|  rawDataSize               |-1                                                       |       |
|  numFiles                  |1                                                        |       |
|  transient_lastDdlTime     |1462875634                                               |       |
|  totalSize                 |684                                                      |       |
|  spark.sql.sources.provider|parquet                                                  |       |
|  EXTERNAL                  |FALSE                                                    |       |
|  COLUMN_STATS_ACCURATE     |false                                                    |       |
|  numRows                   |-1                                                       |       |
|                            |                                                         |       |
|# Storage Information       |                                                         |       |
|SerDe Library:              |org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe       |       |
|InputFormat:                |org.apache.hadoop.mapred.SequenceFileInputFormat         |       |
|OutputFormat:               |org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat|       |
|Compressed:                 |No                                                       |       |
|Num Buckets:                |2                                                        |       |
|Bucket Columns:             |[b]                                                      |       |
|Sort Columns:               |[c]                                                      |       |
|Storage Desc Parameters:    |                                                         |       |
|  path                      |file:/Users/lian/local/src/spark/workspace-a/target/...  |       |
|  serialization.format      |1                                                        |       |
+----------------------------+---------------------------------------------------------+-------+
```

## How was this patch tested?

Test cases are added in `HiveDDLSuite` to check command output.

Author: Cheng Lian <lian@databricks.com>

Closes #13025 from liancheng/spark-14127-extract-schema-info.

(cherry picked from commit 8a12580)
Signed-off-by: Yin Huai <yhuai@databricks.com>
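As a hedged illustration of the general technique this follow-up relies on — data source tables persist their schema as string-valued table properties, split across numbered parts so no single value gets too long. The property key names below follow Spark's `spark.sql.sources.schema.*` convention, but treat the exact keys and this helper as an assumption, not the real `DDLUtils` API:

```scala
// Sketch: reassembling a schema string that was stored split across
// numbered table-property parts. The key names are assumptions modeled
// on Spark's "spark.sql.sources.schema.*" convention.
object SchemaProps {
  val NumPartsKey = "spark.sql.sources.schema.numParts"
  val PartPrefix  = "spark.sql.sources.schema.part."

  // Returns the concatenated schema string if the properties contain one.
  def schemaString(props: Map[String, String]): Option[String] =
    props.get(NumPartsKey).map { n =>
      (0 until n.toInt).map(i => props(PartPrefix + i)).mkString
    }
}
```

With properties `numParts -> "2"`, `part.0 -> "{\"fields\":"`, and `part.1 -> "[]}"`, this reassembles the full JSON string; with no schema keys present it returns `None`.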
    describe(relation, buffer)

    append(buffer, "", "", "")
    append(buffer, "# Detailed Table Information", relation.catalogTable.toString, "")
@liancheng To improve the output of `EXPLAIN`, I plan to change the default implementation of `toString` of the `CatalogTable` case class. That will also affect the output of `DESCRIBE EXTENDED`.
I checked what Hive does for `DESCRIBE EXTENDED`; its output follows.
Detailed Table Information Table(tableName:t1, dbName:default, owner:root, createTime:1462627092, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:col, type:int, comment:null)], location:hdfs://6b68a24121f4:9000/user/hive/warehouse/t1, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{transient_lastDdlTime=1462627092}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)
Basically, in the implementation of `toString`, I will try to follow what you did in `describeFormatted`, but the contents will be in a single line. Feel free to let me know if you have any concerns or suggestions. Thanks!
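As a rough illustration of such a single-line `toString` (using a made-up, heavily simplified case class, not the real `CatalogTable`), the idea is to render `field:value` pairs comma-separated inside one `Table(...)` wrapper, much like Hive's output shown above:

```scala
// Hypothetical, simplified stand-in for CatalogTable demonstrating a
// single-line toString: "Table(tableName:..., dbName:..., ...)".
case class SimpleCatalogTable(
    database: String,
    table: String,
    tableType: String,
    inputFormat: Option[String]) {

  override def toString: String = {
    val fields = Seq(
      s"tableName:$table",
      s"dbName:$database",
      s"tableType:$tableType") ++
      inputFormat.map(f => s"inputFormat:$f") // optional fields rendered only when present
    fields.mkString("Table(", ", ", ")")
  }
}
```

Optional fields (here `inputFormat`) are simply omitted when absent, so the single line stays compact.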